AITopics | homologous protein

homologous protein

DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening

Gao, Bowen

Neural Information Processing Systems Feb-15-2026, 18:34:07 GMT

Following this thought, we recast virtual screening as an information retrieval task, i.e., given a

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Public Health (0.67)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Efficiently Predicting Mutational Effect on Homologous Proteins by Evolution Encoding

Zhong, Zhiqiang, Mottin, Davide

arXiv.org Artificial Intelligence Jun-25-2024

Predicting protein properties is paramount for biological and medical advancements. Current protein engineering mutates a typical protein, called the wild-type, to construct a family of homologous proteins and study their properties. Yet existing methods easily neglect subtle mutations and so fail to capture their effect on protein properties. To this end, we propose EvolMPNN, an Evolution-aware Message Passing Neural Network, an efficient model that learns evolution-aware protein embeddings. EvolMPNN samples sets of anchor proteins, computes evolutionary information by means of residues, and employs a differentiable evolution-aware aggregation scheme over these sampled anchors. This way, EvolMPNN can efficiently utilise a novel message-passing method to capture the mutation effect on proteins with respect to the anchor proteins. Afterwards, the aggregated evolution-aware embeddings are integrated with sequence embeddings to generate final comprehensive protein embeddings. Our model performs up to 6.4% better than state-of-the-art methods and attains a 36X inference speedup compared with large pre-trained models. Code and models are available at https://github.com/zhiqiangzhongddu/EvolMPNN.
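The anchor idea in the abstract can be illustrated with a toy sketch. This is not EvolMPNN itself: the function names (`mutation_profile`, `anchor_features`, `sample_anchors`), the example sequences, and the use of a plain mismatch fraction in place of a learned, differentiable aggregation are all assumptions made for illustration. It only shows how a mutant might be described relative to a few sampled anchor proteins rather than relative to every family member.

```python
import random

def mutation_profile(seq, anchor):
    """Per-residue mismatch indicator (1 = mutated) between two sequences."""
    if len(seq) != len(anchor):
        # Assumption: substitution-only mutants, so lengths match.
        raise ValueError("sketch assumes equal-length sequences")
    return [0 if a == b else 1 for a, b in zip(seq, anchor)]

def anchor_features(seq, anchors):
    """Fraction of mutated residues of `seq` relative to each anchor."""
    return [sum(mutation_profile(seq, a)) / len(seq) for a in anchors]

def sample_anchors(family, k, seed=0):
    """Sample k anchor proteins from the family without replacement."""
    rng = random.Random(seed)
    return rng.sample(family, k)

# Hypothetical wild-type and three point mutants.
wild_type = "MKTAYIAK"
family = [wild_type, "MKTAYIAR", "MKSAYIAK", "MRTAYLAK"]

anchors = sample_anchors(family, k=2)
features = {seq: anchor_features(seq, anchors) for seq in family}
```

Each protein ends up with one number per anchor, so the cost grows with the number of anchors rather than with the square of the family size. A real model would replace the mismatch fraction with learned residue embeddings.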

information, protein, residue, (15 more...)

arXiv.org Artificial Intelligence

2402.13418

Country:

North America > United States (0.14)
Europe > Denmark (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening

Gao, Bowen, Qiang, Bo, Tan, Haichuan, Ren, Minsi, Jia, Yinjun, Lu, Minsi, Liu, Jingjing, Ma, Weiying, Lan, Yanyan

arXiv.org Artificial Intelligence Oct-10-2023

Virtual screening, which identifies potential drugs from vast compound databases to bind with a particular protein pocket, is a critical step in AI-assisted drug discovery. Traditional docking methods are highly time-consuming and can only work with a restricted search library in real-life applications. Recent supervised learning approaches that use scoring functions for binding-affinity prediction, although promising, have not yet surpassed docking methods because they depend strongly on limited data with reliable binding-affinity labels. In this paper, we propose a novel contrastive learning framework, DrugCLIP, by reformulating virtual screening as a dense retrieval task and employing contrastive learning to align representations of binding protein pockets and molecules from a large quantity of pairwise data without explicit binding-affinity scores. We also introduce a biological-knowledge-inspired data augmentation strategy to learn better protein-molecule representations. Extensive experiments show that DrugCLIP significantly outperforms traditional docking and supervised learning methods on diverse virtual screening benchmarks with greatly reduced computation time, especially in the zero-shot setting.
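To make the "dense retrieval plus contrastive learning" framing concrete, here is a minimal sketch using only the Python standard library. It assumes pocket and molecule embeddings already exist as plain vectors, and it uses a one-directional InfoNCE loss with a temperature of 0.07; both choices, along with every function name and the example vectors, are assumptions for illustration and are not taken from the abstract. The encoders that would produce the embeddings are out of scope.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(pockets, molecules):
    """Cosine similarity between every pocket and every molecule embedding."""
    p = [normalize(v) for v in pockets]
    m = [normalize(v) for v in molecules]
    return [[sum(a * b for a, b in zip(pv, mv)) for mv in m] for pv in p]

def info_nce(sim, temperature=0.07):
    """Pocket-to-molecule InfoNCE: the positive for row i is column i."""
    loss = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        top = max(logits)  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
        loss += log_norm - logits[i]
    return loss / len(sim)

def rank_molecules(pocket, library):
    """Retrieval step: library indices sorted by similarity to the pocket."""
    sims = similarity_matrix([pocket], library)[0]
    return sorted(range(len(library)), key=lambda j: -sims[j])

# Hypothetical 2-D embeddings: pocket i is paired with molecule i.
pockets = [[1.0, 0.0], [0.0, 1.0]]
molecules = [[0.9, 0.1], [0.2, 0.8]]

aligned_loss = info_nce(similarity_matrix(pockets, molecules))
shuffled_loss = info_nce(similarity_matrix(pockets, molecules[::-1]))
ranking = rank_molecules(pockets[0], molecules)
```

The loss is low when each pocket is most similar to its own molecule and high when the pairing is shuffled. At screening time no loss is computed; candidates are simply ranked by similarity, which is why retrieval over precomputed embeddings can be much cheaper than docking each compound.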

molecule, representation, virtual screening, (15 more...)

arXiv.org Artificial Intelligence

2310.06367

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Addressing preferred orientation in single-particle cryo-EM through AI-generated auxiliary particles

Zhang, Hui, Zheng, Dihan, Wu, Qiurong, Yan, Nieng, Shi, Zuoqiang, Hu, Mingxu, Bao, Chenglong

arXiv.org Artificial Intelligence Sep-26-2023

The single-particle cryo-EM field faces the persistent challenge of preferred orientation, for which general computational solutions are lacking. We introduce cryoPROS, an AI-based approach designed to address this issue. By generating auxiliary particles with a conditional deep generative model, cryoPROS addresses the intrinsic bias in orientation estimation for the observed particles. We effectively employed cryoPROS in the cryo-EM single-particle analysis of the hemagglutinin trimer, showing the ability to restore the near-atomic resolution structure on non-tilt data. Moreover, the enhanced version, named cryoPROS-MP, significantly improves the resolution of the membrane protein NaX using non-tilt data that contains the effects of micelles. Compared with classical approaches, cryoPROS does not need special experimental or image acquisition techniques, providing a purely computational yet effective solution to the preferred orientation problem. Finally, we conduct extensive experiments that establish the low risk of model bias and the high robustness of cryoPROS.

cryoPROS, homologous protein, particle, (17 more...)

arXiv.org Artificial Intelligence

2309.14954

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Adaptive Residue-wise Profile Fusion for Low Homologous Protein Secondary Structure Prediction Using External Knowledge

Wang, Qin, Wei, Jun, Wang, Boyuan, Li, Zhen, Wang, Sheng, Cui, Shuguang

arXiv.org Artificial Intelligence Aug-5-2021

Protein secondary structure prediction (PSSP) is essential for protein function analysis. However, for low homologous proteins, PSSP suffers from insufficient input features. In this paper, we explicitly import external self-supervised knowledge for low homologous PSSP under the guidance of residue-wise profile fusion. In practice, we first demonstrate the superiority of the profile over the Position-Specific Scoring Matrix (PSSM) for low homologous PSSP. Based on this observation, we introduce novel self-supervised BERT features as a pseudo profile, which implicitly involves the residue distribution in all native discovered sequences as complementary features. Furthermore, a novel residue-wise attention is specially designed to adaptively fuse different features (i.e., the original low-quality profile and the BERT-based pseudo profile), which not only takes full advantage of each feature but also avoids noise disturbance. Besides, a feature consistency loss is proposed to accelerate model learning from multiple semantic levels. Extensive experiments confirm that our method outperforms state-of-the-art methods (i.e., by 4.7% for extremely low homologous cases on the BC40 dataset).
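As a rough illustration of residue-wise fusion, the sketch below mixes two per-residue feature vectors with softmax weights computed separately for each residue. It is a sketch under stated assumptions, not the paper's attention module: the function names are hypothetical, and the per-residue scores are passed in by hand, whereas in a trained model they would be produced by learned layers.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a short list of scores."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_residue_wise(profile, pseudo_profile, scores):
    """Blend two feature sources with a separate weight pair per residue.

    profile, pseudo_profile: one feature vector per residue.
    scores[i]: (score_for_profile, score_for_pseudo_profile) at residue i;
    these are hand-supplied here and would be learned in practice.
    """
    fused = []
    for p_vec, q_vec, s in zip(profile, pseudo_profile, scores):
        w_p, w_q = softmax(list(s))
        fused.append([w_p * p + w_q * q for p, q in zip(p_vec, q_vec)])
    return fused

# Hypothetical two-residue example with 2-D features.
profile = [[1.0, 0.0], [1.0, 0.0]]
pseudo_profile = [[0.0, 1.0], [0.0, 1.0]]
scores = [(0.0, 0.0), (-20.0, 20.0)]  # residue 0: equal trust; residue 1: trust pseudo

fused = fuse_residue_wise(profile, pseudo_profile, scores)
```

Because the weights are computed per residue, a position where the original profile is noisy can lean on the pseudo profile without affecting positions where the profile is reliable.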

homologous protein, low-quality profile, protein, (15 more...)

arXiv.org Artificial Intelligence

2108.04176

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Seq-SetNet: Exploring Sequence Sets for Inferring Structures

Ju, Fusong, Zhu, Jianwei, Wei, Guozheng, Zhang, Qi, Sun, Shiwei, Bu, Dongbo

arXiv.org Machine Learning Jun-6-2019

Sequence sets are a widely used type of data source in a large variety of fields. A typical example is protein structure prediction, which takes a multiple sequence alignment (MSA) as input and aims to infer structural information from it. Almost all existing approaches exploit MSAs in an indirect fashion, i.e., they transform MSAs into position-specific scoring matrices (PSSM) that represent the distribution of amino acid types at each column. PSSM can capture column-wise characteristics of an MSA; however, the column-wise characteristics embedded in each individual component sequence are almost entirely neglected. The drawback of PSSM is rooted in the fact that an MSA is essentially an unordered sequence set rather than a matrix. Specifically, the interchange of any two sequences will not affect the whole MSA. In contrast, the pixels in an image essentially form a matrix since any two rows of pixels cannot be interchanged. Therefore, traditional deep neural networks designed for image processing cannot be directly applied to sequence sets. Here, we propose a novel deep neural network framework (called Seq-SetNet) for sequence set processing. By employing a symmetric function module to integrate features calculated from preceding layers, Seq-SetNet is immune to the order of sequences in the input MSA. This advantage enables us to directly and fully exploit MSAs by considering each component protein individually. We evaluated Seq-SetNet by using it to extract structural information from MSAs for protein secondary structure prediction. Experimental results on popular benchmark sets suggest that Seq-SetNet outperforms the state-of-the-art approaches by 3.6% in precision. These results clearly suggest the advantages of Seq-SetNet in sequence set processing, and it can be readily used in a wide range of fields, such as natural language processing.
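The order-invariance argument can be checked on a toy alignment. The sketch below computes a PSSM-like column frequency summary and, separately, pools a per-sequence feature with an element-wise maximum, which is one example of a symmetric function. The `encode` feature (a hydrophobic-residue flag), the example sequences, and the choice of max pooling are assumptions for illustration; the abstract does not say which symmetric function or features Seq-SetNet uses.

```python
from collections import Counter

def column_frequencies(msa):
    """PSSM-like summary: amino acid frequencies in each alignment column."""
    depth = len(msa)
    return [
        {aa: n / depth for aa, n in Counter(col).items()}
        for col in zip(*msa)
    ]

def encode(seq):
    """Hypothetical per-sequence feature: 1.0 where the residue is hydrophobic."""
    return [1.0 if aa in "AILMFVW" else 0.0 for aa in seq]

def symmetric_pool(msa):
    """Encode each sequence on its own, then take an element-wise max.

    max() does not depend on argument order, so the pooled result is
    unchanged when the sequences in the MSA are permuted.
    """
    feats = [encode(s) for s in msa]
    return [max(col) for col in zip(*feats)]

# Hypothetical aligned sequences and a reordering of the same set.
msa = ["MKTA", "MRTA", "LKSA"]
shuffled = [msa[2], msa[0], msa[1]]

pooled = symmetric_pool(msa)
profile = column_frequencies(msa)
```

Both summaries are unchanged by reordering the sequences. The difference is where information is lost: the frequency profile discards which residues co-occur within one sequence, while the pooled representation is built from features computed on each sequence individually before aggregation.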

artificial intelligence, machine learning, sequence, (18 more...)

arXiv.org Machine Learning

1906.11196

Country: Asia > China (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Ensembles of Models and Metrics for Robust Ranking of Homologous Proteins

Tomal, Jabed H, Welch, William J, Zamar, Ruben H

arXiv.org Machine Learning Jun-21-2017

An ensemble of models (EM), where each model is constructed on a diverse subset of feature variables, is proposed to rank rare-class items ahead of majority-class items in a highly unbalanced two-class problem. The proposed ensemble relies on an algorithm to group the feature variables into subsets, where the variables in a subset work better together in a model and the variables in different subsets work better in separate models. The strength of the EM depends on the algorithm's ability to identify strong and diverse subsets of feature variables. A second phase of ensembling is achieved by aggregating several EMs, each optimized on a diverse evaluation metric. The resulting ensemble is called an ensemble of models and metrics (EMM). Here, the diverse, complementary evaluation metrics ensure increased diversity among the EMs to aggregate. The ensembles are applied to the protein homology data, downloaded from the 2004 KDD Cup competition website, to rank proteins in such a way that the rare homologous proteins are found ahead of the majority non-homologous proteins. The ensembles are constructed using feature variables that are various scores from sequence alignments of amino acids in a candidate protein and three-dimensional descriptions of a native protein, representing functional and structural similarity of proteins. While the prediction performances of the EMs are better than the contemporary state-of-the-art ensembles and competitive with the winning procedures of the 2004 KDD Cup competition, the performances of the EMM are the best overall. In this application, we have two diverse EMs constructed on two complementary evaluation metrics, average precision and rank last, where the former is robust against ranking close homologs and the latter is robust against ranking distant homologs. The advantage of using EMM is that it is robust against both close and distant homologs.
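The two evaluation metrics named in the abstract can be written down in a few lines. The sketch below uses the standard definition of average precision and takes rank last to be the rank position of the lowest-ranked homolog; the abstract does not spell out either definition, so both are assumptions. Combining two models by averaging their scores is likewise an assumed, simple stand-in for however the paper aggregates EMs, and the example scores and labels are made up.

```python
def ranked_labels(scores, labels):
    """Labels reordered by descending score (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]

def average_precision(scores, labels):
    """Mean of precision@rank taken at the rank of each positive item."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels(scores, labels), start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def rank_last(scores, labels):
    """Rank (1-based) of the lowest-ranked positive; smaller is better.

    Raises ValueError if there are no positive items.
    """
    ranked = ranked_labels(scores, labels)
    return max(r for r, label in enumerate(ranked, start=1) if label == 1)

def average_scores(*score_lists):
    """Assumed aggregation: element-wise mean of several models' scores."""
    return [sum(vals) / len(vals) for vals in zip(*score_lists)]

# Hypothetical candidates: 1 = homologous, 0 = non-homologous.
labels = [1, 0, 1, 0, 0]
scores_a = [0.9, 0.8, 0.3, 0.2, 0.1]   # one homolog ranked low
scores_b = [0.6, 0.1, 0.7, 0.2, 0.3]   # both homologs ranked high

ap_a = average_precision(scores_a, labels)
rkl_a = rank_last(scores_a, labels)
combined = average_scores(scores_a, scores_b)
rkl_combined = rank_last(combined, labels)
```

Average precision rewards placing positives near the top, so it is dominated by the easy, close homologs. Rank last depends only on the hardest positive, which is why the two metrics pull an ensemble in complementary directions.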

data mining, machine learning, protein, (16 more...)

arXiv.org Machine Learning

1706.06971

Country:

North America > Canada > Ontario (0.28)
North America > United States > Texas (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
